Predicting underlying pitch targets for intonation modeling

Author

  • Xuejing Sun
Abstract

This paper reports our preliminary attempt at modeling intonation using underlying pitch targets. The underlying pitch targets were derived using a nonlinear regression technique under the pitch target approximation model [17, 19]. We assume that the use of underlying pitch targets can capture the most important intonation patterns while maintaining critical predictive power. Another important aspect of our approach is that we do not rely on pitch accent as a component in the system. To predict the parameters of the underlying targets, we used a recurrent neural network combined with a time-delay window. Comparing the predicted and original pitch targets, the root mean square error (RMSE) is 7.90 Hz, and the correlation coefficient (r) is 0.78. The results are encouraging and suggest that the use of underlying pitch targets is a promising approach to intonation modeling.
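The two evaluation figures quoted in the abstract, RMSE and the correlation coefficient r, can be illustrated with a minimal sketch. It assumes the standard definitions of root mean square error and Pearson correlation over paired predicted and original values; the function names and the sample values in Hz are hypothetical and are not taken from the paper.

```python
import math


def rmse(predicted, original):
    """Root mean square error between two equal-length sequences (same unit as the input, e.g. Hz)."""
    if len(predicted) != len(original) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, original)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(predicted, original):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(original) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_p = sum(predicted) / n
    mean_o = sum(original) / n
    covariance = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, original))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in original)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_p * var_o)


# Hypothetical pitch values in Hz, for illustration only (not data from the paper).
original_hz = [180.0, 200.0, 220.0, 190.0]
predicted_hz = [185.0, 195.0, 225.0, 185.0]

print(f"RMSE = {rmse(predicted_hz, original_hz):.2f} Hz")
print(f"r    = {pearson_r(predicted_hz, original_hz):.2f}")
```

RMSE is reported in the unit of the data (Hz here), while r is unitless, so the two figures complement each other: one measures absolute deviation and the other measures how well the predicted values track the shape of the original ones.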

Similar resources

Pitch Targets Anchor Chinese Tone and Intonation Patterns

This paper presents a study on the role of pitch targets in the manifestation of Chinese tone and intonation. Pitch targets are particularly measured as (fundamental frequency) peaks and valleys over time. Analysis and perceptual experiments were conducted on 72 sentences, each with almost identical tone mapping, uttered two times by a female native in statements or questions. The tone and into...

Modeling Improved Prosody Generation from High-Level Linguistically Annotated Corpora

Synthetic speech usually suffers from bad F0 contour surface. The prediction of the underlying pitch targets robustly relies on the quality of the predicted prosodic structures, i.e. the corresponding sequences of tones and breaks. In the present work, we have utilized a linguistically enriched annotated corpus to build data-driven models for predicting prosodic structures with increased accura...

Maximum-likelihood dynamic intonation model for concatenative text-to-speech system

In this work we present a Maximum Likelihood (ML) joint pitch curve modeling, inspired by HMM TTS synthesis concept. This model provides an optimal solution for the coarse target intonation curve (3 points per syllable) and incorporates both static and dynamic pitch values for better utterance intonation modeling. The coarse intonation curve may be optionally combined with the original pitch ex...

Intonation Components in short English Statements

In this study we attempt to identify the basic components of statement intonation as related to focus, accent and lexical stress in General American English. Instead of viewing F0 contours as direct acoustic correlates of intonation components, we regard them as the outcome of implementing different functional components of intonation under various articulatory constraints. Eight American Eng...

Timing of experimentally elicited minimal responses as quantitative evidence for the use of intonation in projecting TRPs

In an RT experiment, subjects were asked to respond with minimal responses to prerecorded dialogs and a manipulated version of these dialogs that contained only intonation and pause information. Response delays and, especially, variances were higher to the impoverished, intonation only, stimuli than to the original recordings. It was also found that intonation only utterances ending in a mid-fr...

Publication date: 2001